Fast Spectral Clustering of Data Using Sequential Matrix Compression
Authors
Abstract
Spectral clustering has attracted considerable research interest in recent years because it can yield very good clustering results. Traditional spectral clustering algorithms first solve an eigenvalue decomposition problem to obtain a low-dimensional embedding of the data points, and then apply a heuristic method such as k-means to obtain the desired clusters. However, eigenvalue decomposition is very time-consuming, which makes the overall complexity of spectral clustering high and prevents it from being widely applied to large-scale problems. To tackle this problem, we propose a fast and scalable spectral clustering algorithm called the sequential matrix compression (SMC) method. Unlike traditional algorithms, it reduces the computational complexity of spectral clustering by sequentially reducing the dimension of the Laplacian matrix across the iteration steps, with very little loss of accuracy. Experiments demonstrate the feasibility and efficiency of the proposed algorithm.
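For reference, the traditional pipeline that the abstract contrasts against (eigendecomposition of a graph Laplacian followed by k-means) can be sketched roughly as below. This is not the proposed SMC method, whose details are not given on this page. The Gaussian affinity, the `sigma` parameter, the choice of the symmetric normalized Laplacian, the farthest-point k-means initialization, the toy data, and all function names are illustrative assumptions.

```python
import numpy as np


def rbf_affinity(X, sigma=1.0):
    """Dense Gaussian affinity matrix with a zero diagonal (assumed kernel)."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W


def spectral_embedding(W, k):
    """Row-normalized embedding from the k smallest Laplacian eigenvectors."""
    d = W.sum(axis=1)  # node degrees; assumed strictly positive here
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # Full symmetric eigendecomposition; this is the costly step.
    _, vecs = np.linalg.eigh(L)  # eigenvalues come back in ascending order
    U = vecs[:, :k]
    return U / np.linalg.norm(U, axis=1, keepdims=True)


def kmeans(U, k, n_iter=50):
    """Plain Lloyd iterations with a deterministic farthest-point start."""
    centers = [U[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(sq, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = U[labels == j].mean(axis=0)
    return labels


def spectral_clustering(X, k, sigma=1.0):
    """Affinity -> Laplacian eigenvectors -> k-means on the embedding."""
    return kmeans(spectral_embedding(rbf_affinity(X, sigma), k), k)


# Toy usage: two well-separated Gaussian blobs of 20 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(5.0, 0.3, (20, 2))])
labels = spectral_clustering(X, k=2)
```

The call to `np.linalg.eigh` on the full n-by-n Laplacian is the step whose cost grows fastest with the number of points, which is the bottleneck the abstract refers to.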
Similar resources
Matrix Sequential Hybrid Credit Scorecard Based on Logistic Regression and Clustering
The Basel II Accord pointed out the benefits of credit risk management through internal models that estimate the Probability of Default (PD). Banks use default predictions to estimate loan applicants' PD. However, in practice, PD is not useful and banks applied credit scorecards in their decision-making process. Also, competitive pressures in the lending industry forced banks to use profit scorecards...
Clustering algorithm for audio signals based on the sequential Psim matrix and Tabu Search
Audio signals are a type of high-dimensional data, and their clustering is critical. However, distance calculation failures, inefficient index trees, and cluster overlaps, derived from the equidistance, redundant attribute, and sparsity, respectively, seriously affect the clustering performance. To solve these problems, an audio-signal clustering algorithm based on the sequential Psim matrix an...
Non-negative bases in spectral image archiving
This thesis supposes an application of Principal Component Analysis (PCA), Non-negative Matrix Factorization (NMF) and Non-negative Tensor Factorization (NTF) for digital image archiving. It is aimed to develop new efficient methods for spectral image acquisition, compression and retrieval. It hypothesizes that the non-negative bases are more suitable for spectral archiving beside convenient or...
Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis
Recognizing genes with distinctive expression levels can help in prevention, diagnosis and treatment of the diseases at the genomic level. In this paper, fast Global k-means (fast GKM) is developed for clustering the gene expression datasets. Fast GKM is a significant improvement of the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...
Large Scale Spectral Clustering Using Resistance Distance and Spielman-Teng Solvers
Spectral clustering is a novel clustering method which can detect complex shapes of data clusters. However, it requires the eigendecomposition of the graph Laplacian matrix, whose cost is proportional to O(n^3), and thus it is not suitable for large-scale systems. Recently, many methods have been proposed to reduce the computation time of spectral clustering. These approximate methods usually involve...